

A Additional Results

Neural Information Processing Systems

The acronym dataset is a QA task that requires models to decode financial acronyms. The FinMA-7B-full model achieved the highest ROUGE-1 score of 0.12.

B.1 Why was the datasheet created?

B.2 Has the dataset been used already? If so, where are the results so others can compare (e.g., links to published papers)? Yes, the dataset has already been used. It was employed in the FinLLM shared task during the FinNLP-AgentScen Workshop at IJCAI 2024, known as the FinLLM Challenge.





RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models

Lin, Xueyuan, Yang, Cehao, Ma, Ye, Li, Ming, Zhang, Rongjunchen, Ni, Yang, Wu, Xiaojun, Xu, Chengjin, Guo, Jian, Xiong, Hui

arXiv.org Artificial Intelligence

Recently, large language models (LLMs) have demonstrated outstanding reasoning capabilities on mathematical and coding tasks. However, their application to financial tasks, especially the most fundamental task of stock movement prediction, remains underexplored. We study a three-class classification problem (up, hold, down) and, by analyzing existing reasoning responses, observe that: (1) LLMs follow analysts' opinions rather than exhibiting systematic, independent analytical logic in their chains of thought (CoTs); (2) LLMs list summaries from different sources without weighing adversarial evidence, yet such counterevidence is crucial for reliable prediction. This shows that the models do not make good use of their reasoning ability to complete the task. To address this, we propose Reflective Evidence Tuning (RETuning), a cold-start method applied prior to reinforcement learning, to enhance prediction ability. While generating a CoT, RETuning encourages the model to dynamically construct an analytical framework from diverse information sources, organize and score evidence for upward or downward price movement based on that framework rather than on contextual viewpoints, and finally reflect to derive the prediction. This approach maximally aligns the model with its learned analytical framework, ensuring independent logical reasoning and reducing undue influence from context. We also build a large-scale dataset spanning all of 2024 for 5,123 A-share stocks, with long contexts (32K tokens) and over 200K samples. In addition to price and news, it incorporates analysts' opinions, quantitative reports, fundamental data, macroeconomic indicators, and similar stocks. Experiments show that RETuning successfully unlocks the model's reasoning ability in the financial domain. Inference-time scaling still works even after six months or on out-of-distribution stocks, since the models gain valuable insights about stock movement prediction.
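The three-class setup (up, hold, down) described in the abstract can be sketched as a simple return-thresholding rule. This is a hypothetical illustration: the 1% threshold and the `label_movement` helper are assumptions for the sketch, not details taken from the paper.

```python
def label_movement(ret: float, threshold: float = 0.01) -> str:
    """Map a fractional price return to a three-class movement label.

    The symmetric 1% threshold is an assumed value for illustration;
    the paper's actual labeling scheme may differ.
    """
    if ret > threshold:
        return "up"
    if ret < -threshold:
        return "down"
    return "hold"

# Example: a +2% day is "up", a -5% day is "down", +0.3% is "hold".
labels = [label_movement(r) for r in (0.02, -0.05, 0.003)]
```

A threshold band around zero is what makes "hold" a meaningful class; without it, near-zero returns would be forced into "up" or "down" arbitrarily.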






Assessing the Capabilities and Limitations of FinGPT Model in Financial NLP Applications

Djagba, Prudence, Odinakachukwu, Chimezie A.

arXiv.org Artificial Intelligence

The financial industry has long been a pioneer in adopting cutting-edge technologies to enhance operational efficiency, accuracy, and strategic decision-making [2]. With the exponential growth of structured and unstructured data, particularly from news feeds, earnings reports, disclosures, and social media, there is an increasing demand for intelligent systems capable of processing human language at scale [11]. Initially, the industry relied on rule-based approaches and traditional statistical techniques such as bag-of-words and TF-IDF [28], which offered limited semantic understanding. As noted by Abubakar et al. [1], these limitations triggered a shift toward machine learning and deep learning models that, while better at capturing patterns, still required substantial domain-specific feature engineering. This landscape was significantly transformed with the introduction of transformer-based architectures, most notably the Generative Pre-trained Transformer (GPT) family [5]. These models demonstrated the power of large-scale pretraining followed by task-specific fine-tuning, enabling generalization across diverse NLP tasks. Models such as GPT-3, GPT-4, BERT, and T5 have delivered state-of-the-art results in sentiment analysis, summarization, question answering, and named entity recognition [13]. Beyond LLMs, the broader field of Generative AI (GAI)--including GANs, VAEs, and diffusion models--has found increasing relevance in finance, facilitating applications such as synthetic data generation, automated reporting, and scenario simulation [32, 31]. LLMs have emerged as essential tools in processing unstructured financial text, especially models fine-tuned on finance-specific corpora like FinBERT, BloombergGPT, and FinGPT [4, 39].
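The bag-of-words and TF-IDF representations mentioned above can be illustrated with a minimal pure-Python sketch. This is an assumed, simplified formulation (raw term frequency times unsmoothed inverse document frequency); production systems would typically use a library vectorizer with smoothing and normalization.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf(t, d)  = count of t in d / length of d
    idf(t)    = ln(N / df(t)), where df(t) is the number of
                documents containing t (no smoothing, for clarity).
    """
    n = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

# Toy financial-text corpus (hypothetical tokens).
docs = [["stock", "price", "up"],
        ["stock", "price", "down"],
        ["earnings", "report"]]
w = tfidf(docs)
```

Terms that appear in every document receive low weight, while rarer, more discriminative terms score higher; this is exactly the "limited semantic understanding" trade-off noted above, since the weighting captures term rarity but no word meaning.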